Abstract

We prove several results from different areas of extremal combinatorics, giving complete or 
partial solutions to a number of open problems. These results, coming from areas such as extremal 
graph theory, Ramsey theory and additive combinatorics, have been collected together because in 
each case the relevant proofs are quite short. 

1 Introduction 

We study several questions from extremal combinatorics, a broad area of discrete mathematics which 
deals with the problem of maximizing or minimizing the cardinality of a collection of finite objects 
satisfying a certain property. The problems we consider come mainly from the areas of extremal graph
theory, Ramsey theory and additive combinatorics. In each case, we give a complete or partial solution
to an open problem posed by researchers in the area.

While each of the results in this paper is interesting in its own right, the proofs are all quite short. 
Accordingly, in the spirit of Alon's 'Problems and results in extremal combinatorics' papers OH], we 
have chosen to combine them. We describe the results in brief below. For full details on each topic 
we refer the reader to the relevant section, each of which is self-contained and may be read separately 
from all others. 

In Section 2, we improve a result of Alon [3] on the size of the largest induced forest in a bipartite
graph of given average degree. In Section 3, we prove a conjecture of Balister, Lehel and Schelp [7] on
Ramsey saturated graphs. We study the relationship between degeneracy and online Ramsey games in
Section 4, addressing a question raised by Grytczuk, Hałuszczak and Kierstead [25]. In Section 5, we
improve a recent result of Dudek and Mubayi [17] on generalized Ramsey numbers for hypergraphs.
We prove a conjecture of Cavers and Verstraëte [11] in Section 6 by showing that any graph on $n$
vertices whose complement has $o(n^2)$ edges has a clique partition using $o(n^2)$ cliques. In Section 7, we
improve a result of Hegyvári [28] on the size of the largest Hilbert cube that may be found in a dense
subset of the integers.

All logarithms are base 2 unless otherwise stated. For the sake of clarity of presentation, we system- 
atically omit floor and ceiling signs whenever they are not crucial. We also do not make any serious 
attempt to optimize absolute constants in our statements and proofs. 
2 Induced forests in sparse bipartite graphs



Every bipartite graph trivially has an independent set with at least half of its vertices. Under what 
conditions can we find a considerably larger sparse set? A forest is a graph without cycles. Akiyama 
and Watanabe [1] and, independently, Albertson and Haas [2] conjectured that every planar bipartite
graph on $n$ vertices contains an induced forest on at least $5n/8$ vertices. Motivated by this conjecture,
Alon [3] considered induced forests in sparse bipartite graphs, showing that every bipartite graph on $n$
vertices with average degree at most $d \ge 1$ contains an induced forest on at least $(\frac{1}{2} + e^{-bd^2})n$ vertices,
for some absolute constant $b > 0$. On the other hand, there exist bipartite graphs on $n$ vertices with
average degree at most $d \ge 1$ that contain no induced forest on at least $(\frac{1}{2} + e^{-b\sqrt{d}})n$ vertices. Alon
remarks that it would be interesting to close the gap between the lower and upper bounds for this
problem. We improve the lower bound here to $(\frac{1}{2} + d^{-bd})n$.

In particular, since the average degree of any planar bipartite graph is less than 4, there is an absolute
positive constant $\epsilon$ such that every planar bipartite graph on $n$ vertices contains an induced forest
on at least $(1/2 + \epsilon)n$ vertices. This gives a nontrivial result on the question raised in [1, 2]. More
recently, it was shown in [31] (see also [30]) that every triangle-free (and hence every bipartite) planar
graph on $n$ vertices contains an induced forest on at least $71n/128$ vertices.

As in Alon's proof, we show that every sparse bipartite graph contains a large induced subgraph whose 
connected components are stars. 

Theorem 2.1 Let $d$ be a positive integer. Every bipartite graph $G$ on $n$ vertices with average degree at
most $d$ contains an induced subgraph on at least $(\frac{1}{2} + \delta)n$ vertices with $\delta = (2^7 d^2)^{-4d}$ whose connected
components are stars.

Proof: Suppose, for contradiction, that the theorem is false. Let $X$ and $Y$ denote a bipartition of $G$
into independent sets with $|X| \le |Y|$. We may assume $\frac{1}{2}n \le |Y| < (\frac{1}{2} + \delta)n$, as $Y$ is an independent
set and hence induces a star-forest. We have $|X| > (1/2 - \delta)n > n/4$.

We will construct a sequence $Y_0 \subset Y_1 \subset \dots \subset Y_{4d}$ of nested subsets of $Y$. Let $Y_0 = \emptyset$. Once $Y_i$ has
been defined, let $\delta_{i+1} = \delta + |Y_i|/n$, $d_{i+1} = 1/(2^7 d \delta_{i+1})$ and let $Y_{i+1}$ consist of those vertices in $Y$ with
degree at least $d_{i+1}$. As $G$ has at most $dn/2$ edges and every vertex in $Y_i$ has degree at least $d_i$, we
have $|Y_i| d_i \le dn/2$ and hence $|Y_i|/n \le d/(2d_i)$. We therefore have $\delta_1 = \delta$ and

$$\delta_{i+1} \le \delta + d/(2d_i) = \delta + (d/2)(2^7 d \delta_i) = \delta + 2^6 d^2 \delta_i \le 2^7 d^2 \delta_i.$$

Hence, for $i \ge 1$, by induction on $i$ we have $\delta_i \le (2^7 d^2)^{i-1} \delta$ and $d_i \ge 1/((2^7 d^2)^i \delta)$. In particular,
by the choice of $\delta$, we have $d_i \ge 1$ for all $i \le 4d$.

Let $e_i$ denote the number of edges containing a vertex in $Y_i \setminus Y_{i-1}$. The following claim completes the
proof by contradiction, as the number of edges of $G$ is at most $dn/2$ and at least

$$\sum_{i=1}^{4d} e_i > 4d \cdot n/8 = dn/2.$$

Claim 1 For $i \ge 1$, $e_i > n/8$.

Indeed, suppose $e_i \le n/8$. Let $X_i \subset X$ consist of those vertices not adjacent to any vertex in
$Y_i \setminus Y_{i-1}$. We have $|X_i| \ge |X| - e_i > n/8$. Let $X_i' \subset X_i$ consist of those vertices of degree at most
$8d$. We have $|X_i'| \ge |X_i|/2 > n/16$ as otherwise the number of edges of $G$, which is at most $dn/2$,
is more than $|X_i \setminus X_i'| \cdot 8d \ge (n/16) \cdot 8d = dn/2$, a contradiction. Pick out vertices $x_1, \dots, x_t$ from $X_i'$
greedily as follows. We pick $x_j$ arbitrarily from the remaining vertices, and delete from $X_i'$ all vertices
which share a neighbor with $x_j$ not in $Y_{i-1}$. Since $x_j$ has degree at most $8d$ and all its neighbors
not in $Y_{i-1}$ have degree at most $d_i \ge 1$ (recall that every vertex in $X_i$ is chosen so as not to have
neighbors in $Y_i \setminus Y_{i-1}$), there are at most $8d(d_i - 1)$ vertices in $X_i'$ which share a neighbor with $x_j$ not
in $Y_{i-1}$. We get $t \ge |X_i'|/(8d d_i) \ge n/(2^7 d d_i) = \delta_i n$ and the induced subgraph of $G$ with vertex set
$\{x_1, \dots, x_t\} \cup (Y \setminus Y_{i-1})$ is a star-forest with at least $\delta_i n + |Y| - |Y_{i-1}| = \delta n + |Y| \ge (\frac{1}{2} + \delta)n$ vertices.
This verifies the claim and completes the proof of the theorem. □
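The parameter bookkeeping in this proof can be sanity-checked numerically. The following is a minimal sketch in Python, using exact rational arithmetic and assuming the worst case permitted by the recursion, namely $\delta_{i+1} = \delta + 2^6 d^2 \delta_i$. The function names are ours and purely illustrative; the check confirms, for small $d$, that $\delta_i \le (2^7 d^2)^{i-1}\delta$ and $d_i \ge 1$ for all $i \le 4d$.

```python
from fractions import Fraction


def worst_case_deltas(d):
    """Return delta and the list [delta_1, ..., delta_{4d}] in the worst case
    delta_{i+1} = delta + 2^6 d^2 delta_i allowed by the recursion."""
    base = 2**7 * d**2
    delta = Fraction(1, base ** (4 * d))  # delta = (2^7 d^2)^(-4d)
    deltas = [delta]  # deltas[0] is delta_1
    for _ in range(4 * d - 1):
        deltas.append(delta + 2**6 * d**2 * deltas[-1])
    return delta, deltas


def check_bounds(d):
    """Check delta_i <= (2^7 d^2)^(i-1) delta and d_i >= 1 for 1 <= i <= 4d."""
    base = 2**7 * d**2
    delta, deltas = worst_case_deltas(d)
    for i, delta_i in enumerate(deltas, start=1):
        d_i = 1 / (2**7 * d * delta_i)  # d_i = 1 / (2^7 d delta_i)
        if delta_i > base ** (i - 1) * delta or d_i < 1:
            return False
    return True
```

This only verifies the arithmetic of the recursion; it says nothing about the combinatorial part of the argument.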

A much better bound may be proved for regular graphs. Indeed, it is shown in [3] that every $d$-regular
bipartite graph on $n$ vertices contains an induced forest with at least $(\frac{1}{2} + \frac{1}{2(d-1)^2})n$ vertices. Moreover,
this result is sharp in its dependence on $d$. It seems to us that the bound given in Theorem 2.1 should
also be close to best possible. It would, for example, be very interesting to improve Alon's upper
bound to show that there exist bipartite graphs on $n$ vertices with average degree at most $d \ge 1$ that
contain no induced forest on at least $(\frac{1}{2} + e^{-b'd})n$ vertices.

3 Ramsey saturated graphs 

The Ramsey number $r(G)$ of a graph $G$ is the smallest natural number $N$ such that every two-coloring
of the edges of the complete graph $K_N$ contains a monochromatic copy of $G$. The fact that these
numbers exist was first proved by Ramsey [39].

Following Balister, Lehel and Schelp [7], we say that a graph $G$ on $n$ vertices is Ramsey unsaturated
if there exists an edge $e \in E(K_n) \setminus E(G)$ such that $r(G + e) = r(G)$. However, if $r(G + e) > r(G)$ for
all edges $e \in E(K_n) \setminus E(G)$, we say that $G$ is Ramsey saturated.

There are many open questions about Ramsey saturated and unsaturated graphs. For example, it is
not even known whether $K_n - e$ is saturated, that is, whether $r(K_n) > r(K_n - e)$, for any $n \ge 7$, though
Balister, Lehel and Schelp conjecture that this should be the case.

One result proved by Balister, Lehel and Schelp [7] is that there are at least $\lfloor (n-2)/2 \rfloor$ non-isomorphic
Ramsey saturated graphs on $n$ vertices. Moreover, they conjectured that there should be $c > 0$ and
$\epsilon > 0$ for which there are at least $cn^{1+\epsilon}$ non-isomorphic Ramsey saturated graphs on $n$ vertices. Here
we prove this conjecture in a strong form, as follows.

Theorem 3.1 There exists $c > 0$ such that there are at least $2^{cn^2}$ non-isomorphic Ramsey saturated
graphs on $n$ vertices.

The proof of this theorem is a straightforward combination of two results from graph Ramsey theory. 
The first is a standard lower bound for the Ramsey number of a graph with n vertices and m edges 
which follows from the probabilistic method. 

Lemma 3.1 For any graph $G$ with $n$ vertices and $m$ edges, the Ramsey number $r(G)$ satisfies

$$r(G) > 2^{\frac{m}{n} - 1}.$$

Proof: Let $N = 2^{\frac{m}{n} - 1}$ and color the edges of $K_N$ at random, each edge being red or blue with
probability $\frac{1}{2}$. Let $X$ be the random variable counting the number of monochromatic copies of $G$.
Then

$$E[X] \le 2^{1-m} N^n = 2^{1-m} (2^{\frac{m}{n} - 1})^n = 2^{1-n} < 1.$$

It follows that there must be some coloring of $K_N$ which does not contain a monochromatic copy of
$G$. □
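To make the first-moment computation concrete, here is a small Python sketch for the special case $G = K_n$, where the number of copies of $G$ in $K_N$ is exactly $\binom{N}{n}$. The function names and the choice of example are ours; the sketch compares the exact expectation with the cruder bound $2^{1-m} N^n$ used in the proof.

```python
from fractions import Fraction
from math import comb


def expected_mono_cliques(N, n):
    """Exact E[X] for G = K_n in a uniformly random 2-coloring of K_N:
    each of the C(N, n) copies is monochromatic with probability 2 * 2^(-m)."""
    m = comb(n, 2)
    return Fraction(2 * comb(N, n), 2**m)


def lemma_bound(N, n, m):
    """The bound 2^(1-m) * N^n on E[X] used in the proof of Lemma 3.1."""
    return Fraction(2 * N**n, 2**m)
```

For instance, with $n = 10$, $m = 45$ and $N = 11 = \lfloor 2^{m/n - 1} \rfloor$, both quantities are below 1, so some coloring of $K_{11}$ has no monochromatic $K_{10}$. This is of course far from the truth for cliques; the point of the lemma is its generality.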

The second lemma we require is an upper bound for the Ramsey number of graphs with $n$ vertices
and $m$ edges. The following result [13, 22] is sufficient for our purposes, though other results [15, 45]
could also be used instead.

Lemma 3.2 For any bipartite graph $G$ with $n$ vertices and maximum degree $\Delta$, the Ramsey number
$r(G)$ satisfies

$$r(G) \le \Delta 2^{\Delta+5} n.$$

Proof of Theorem 3.1: We will show, for $n$ sufficiently large, that there are at least $2^{n^2/80}$ non-isomorphic
Ramsey saturated graphs. Consider a random labeled bipartite graph $G_1$ between two sets
$A$ and $B$, each of size $n/2$, containing $\frac{1}{5}(\frac{n}{2})^2 = \frac{n^2}{20}$ edges. By a standard large deviation inequality for
the hypergeometric distribution, for $n$ sufficiently large, $G_1$ has maximum degree at most $n/8$ (and
positive minimum degree) with probability at least $1/2$. That is, the number of such graphs is at least
$\frac{1}{2}\binom{n^2/4}{n^2/20}$. For any such graph, Lemma 3.2 tells us that

$$r(G_1) \le 2^{\frac{n}{8}+2} n^2 < 2^{\frac{n}{6}}$$

for $n$ sufficiently large. On the other hand, by Lemma 3.1, any graph $G_2$ on $A \cup B$ with $m = \frac{n^2}{5}$ edges
satisfies

$$r(G_2) > 2^{\frac{n}{5} - 1} > 2^{\frac{n}{6}}.$$

It follows that for any $G_1$ for which $r(G_1) \le 2^{\frac{n}{6}}$ there is a graph $G$ with at most $\frac{n^2}{5}$ edges such that
$G_1 \subset G$ and $G$ is Ramsey saturated. Otherwise, starting from $G_1$ and adding edges which do not
increase the Ramsey number one by one, we could find a sequence of graphs $G_1 \subset G_1 \cup \{e\} \subset \dots \subset G_2$
up to a graph $G_2$ with $\frac{n^2}{5}$ edges such that $r(G_1) = r(G_1 \cup \{e\}) = \dots = r(G_2)$, contradicting our
estimate for the Ramsey number of graphs $G_2$ with $\frac{n^2}{5}$ edges.

Since any graph $G$ with at most $\frac{n^2}{5}$ edges contains at most $\binom{n^2/5}{n^2/20}$ labeled subgraphs of size $n^2/20$, we
see that the number of labeled Ramsey saturated graphs is at least

$$\frac{\frac{1}{2}\binom{n^2/4}{n^2/20}}{\binom{n^2/5}{n^2/20}} = \frac{1}{2} \cdot \frac{\frac{n^2}{4}\left(\frac{n^2}{4}-1\right)\cdots\left(\frac{n^2}{4}-\frac{n^2}{20}+1\right)}{\frac{n^2}{5}\left(\frac{n^2}{5}-1\right)\cdots\left(\frac{n^2}{5}-\frac{n^2}{20}+1\right)} \ge \frac{1}{2}\left(\frac{5}{4}\right)^{n^2/20} \ge 2^{n^2/70}$$

for $n$ sufficiently large. Dividing through by $n!$ tells us that the number of non-isomorphic Ramsey
saturated graphs is, again for $n$ sufficiently large, at least $2^{n^2/80}$, as required. □
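The counting step at the end of this proof compares two binomial coefficients factor by factor: each ratio $(\frac{n^2}{4} - j)/(\frac{n^2}{5} - j)$ is at least $\frac{5}{4}$. As a rough numerical illustration, the following Python sketch (with hypothetical helper names, and assuming for simplicity that $20$ divides $n^2$ so that all quantities are integers) checks that the ratio of binomial coefficients dominates $(5/4)^{n^2/20}$.

```python
from fractions import Fraction
from math import comb


def count_ratio(n):
    """binom(n^2/4, n^2/20) / binom(n^2/5, n^2/20), assuming 20 divides n^2."""
    if (n * n) % 20 != 0:
        raise ValueError("this sketch assumes that 20 divides n^2")
    a, b, k = n * n // 4, n * n // 5, n * n // 20
    return Fraction(comb(a, k), comb(b, k))


def product_bound(n):
    """The lower bound (5/4)^(n^2/20) obtained by comparing factors termwise."""
    return Fraction(5, 4) ** (n * n // 20)
```

For example, `count_ratio(10)` is $\binom{25}{5}/\binom{20}{5}$, which exceeds $(5/4)^5$.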

We note that this proof easily extends to more than 2 colors by using the appropriate analogues of
Lemmas 3.1 and 3.2. We omit the details.

Balister, Lehel and Schelp [7] also conjecture that almost all graphs should be Ramsey unsaturated.
While they prove that this is the case for paths and cycles of length at least five (a result which was
recently extended by Skokan and Stein [32]), we feel that the truth probably lies in the other direction,
that is, that almost all graphs should be Ramsey saturated. However, it would still be very interesting
to find further classes of Ramsey unsaturated graphs.

4 Degeneracy and online Ramsey theory 

Online Ramsey games were first introduced by Beck [8] and, independently, by Friedgut, Kohayakawa,
Rödl, Ruciński and Tetali [23] (see also [33]). There are two players, Builder and Painter, playing on
a board consisting of an infinite, independent set of vertices. At each step, Builder exposes an edge
and, as each edge appears, Painter decides whether to color it red or blue. Builder's aim is to force
Painter to draw a monochromatic copy of a fixed graph $G$.

These games have been studied from multiple perspectives. One variant asks for the least number
of edges $r(G)$ which Builder needs to force Painter to draw a monochromatic $G$ (see [TJJ and its
references). Another asks how long the game lasts if the game is played on a board with $n$ vertices
and Builder chooses the edges at random (see [351 SB] and their references).

We will consider another variant, first introduced by Grytczuk, Hałuszczak and Kierstead [25]. Suppose
that we have a positive integer-valued increasing graph parameter such as chromatic number,
degeneracy, treewidth, thickness or genus. The question asked in [25] is whether, for each of these
properties, there exists a function $f : \mathbb{N} \to \mathbb{N}$ such that Builder can force Painter to draw a monochromatic
copy of any graph $G$ with parameter $k$ while himself only drawing graphs with parameter $f(k)$.
To give a concrete example, it was proved in [25] (and extended in [30] to any number of colors) that
Builder may force Painter to draw a monochromatic copy of any graph $G$ with chromatic number $k$
while only drawing graphs of chromatic number $k$ himself.

Here we prove a similar result where the chosen parameter is degeneracy, partially addressing one of
the questions raised by Grytczuk, Hałuszczak and Kierstead [25]. A graph is $d$-degenerate if every
subgraph of it has a vertex of degree at most $d$. Equivalently, a graph $G$ is $d$-degenerate if there is
an ordering of the vertices of $G$, say $u_1, u_2, \dots, u_n$, such that for each $1 \le i \le n$ the vertex $u_i$ has at
most $d$ neighbors $u_j$ with $j < i$. The degeneracy of $G$ is the smallest $d$ for which $G$ is $d$-degenerate.
We will consider the $q$-color online Ramsey game, where Painter has a choice of $q$ colors at each step,
proving the following result.
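The two equivalent definitions of degeneracy can be illustrated computationally. The sketch below is a standard greedy procedure (not taken from the paper, and with a function name of our choosing): it repeatedly removes a vertex of minimum degree, and the reverse of the removal order is then an ordering in which every vertex has at most $d$ earlier neighbors, where $d$ is the largest minimum degree encountered.

```python
def degeneracy_ordering(adj):
    """Compute the degeneracy of a graph and a witnessing vertex ordering.

    adj: dict mapping each vertex to the set of its neighbours.
    Returns (d, order) such that each vertex has at most d neighbours
    appearing before it in `order`.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    removal = []
    d = 0
    while remaining:
        # Remove a vertex of minimum degree in the current subgraph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removal.append(v)
    # Neighbours still present at removal time are exactly the earlier
    # neighbours in the reversed removal order.
    return d, removal[::-1]
```

For example, a path has degeneracy 1, a cycle has degeneracy 2 and $K_4$ has degeneracy 3. This simple version runs in quadratic time, which is enough for illustration.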

Theorem 4.1 In the $q$-color online Ramsey game, Builder may force Painter to draw a monochromatic
copy of any $d$-degenerate graph while only drawing a $(qd - (q - 1))$-degenerate graph.

To prove Theorem 4.1, we will need to use the hypergraph version of Ramsey's theorem |2U[ 155].

Lemma 4.1 For all natural numbers $k$, $\ell$ and $q$ with $\ell \ge k$, there exists an integer $n$ such that if the
edges of the complete $k$-uniform hypergraph on $n$ vertices are $q$-colored, then there is a monochromatic
copy of $K_\ell^{(k)}$.

The smallest such number $n$ is known as the Ramsey number $r_k(\ell; q)$.

Proof of Theorem 4.1: Suppose that $G$ is a $d$-degenerate graph with $n$ vertices. We will show
that, while only drawing a $(qd - (q - 1))$-degenerate graph, it is possible for Builder to force Painter
to construct a sequence of subsets $V_0 \supset V_1 \supset \dots \supset V_t$ and a sequence of colors $C_1, \dots, C_t$, where
$t = q(n - 1) - (q - 1)$, such that $|V_t| \ge d$ and the following holds. For every $d$-set $D$ in $V_i$ there is a
vertex $v_D$ in $V_{i-1}$ such that every vertex $w \in D$ is connected to $v_D$ in color $C_i$. By the choice of $t$,
the pigeonhole principle implies that there is a subsequence $V_0 = V_{i_0} \supset V_{i_1} \supset \dots \supset V_{i_{n-1}}$ such that
$C_{i_1} = \dots = C_{i_{n-1}}$. Without loss of generality, we may assume that this color is red.

It is straightforward to show that the constructed graph contains a red copy of $G$. Let $u_1, \dots, u_n$ be a
$d$-degenerate ordering of the vertices of $G$, that is, such that $u_i$ has at most $d$ neighbors $u_j$ with $j < i$.
We will construct an embedding $f : V(G) \to V_0$ of the vertices of $G$ into the red graph constructed
above by induction, embedding the vertex $u_p$ into the set $V_{i_{n-p}}$. We begin by mapping $u_1$ to any vertex
in $V_{i_{n-1}}$. Suppose now that $u_1, \dots, u_p$ have been embedded and we wish to embed $u_{p+1}$. We know
that $u_{p+1}$ has $\ell \le d$ neighbors $u_{a_1}, \dots, u_{a_\ell}$ with $a_j \le p$. Since the images of each of these vertices
under the embedding lie in $V_{i_{n-p}}$, we see, by taking $D \supseteq \{f(u_{a_1}), \dots, f(u_{a_\ell})\}$, that there is a vertex
$w \in V_{i_{n-(p+1)}}$ such that the edge between $w$ and $f(u_{a_j})$ is red for all $1 \le j \le \ell$. Taking $f(u_{p+1}) = w$
completes the induction.

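It remains to construct the subsets $V_0 \supset V_1 \supset \dots \supset V_t$. Let $n_t = d$ and, for each $1 \le i \le t$, let

$$m_{i-1} = r_s(qd(n_i + 1); q^s) \quad \text{and} \quad n_{i-1} = m_{i-1} + \binom{m_{i-1}}{s},$$

where $s = qd - (q - 1)$. We begin by taking an empty graph of size $n_0$ for $V_0$. Suppose now that
Builder has forced Painter to construct $V_0 \supset V_1 \supset \dots \supset V_{i-1}$ and that $V_{i-1}$ is empty, that is, contains
no edges. We will show how Builder may force Painter to construct an empty set $V_i \subset V_{i-1}$ of size $n_i$ such that
for every $d$-set $D$ in $V_i$ there is a vertex $v_D$ in $V_{i-1} \setminus V_i$ such that every vertex $w \in D$ is connected
to $v_D$ in a fixed color $C_i$.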

Suppose, therefore, that $V_{i-1}$ is empty. By the choice of $n_{i-1}$, we may partition $V_{i-1}$ into two pieces
$W_i$ and $V_{i-1} \setminus W_i$ of sizes $m_{i-1}$ and $\binom{m_{i-1}}{s}$, respectively. For each $s$-set $S$ in $W_i$, Builder now chooses
a unique vertex $v_S$ in $V_{i-1} \setminus W_i$ (which is possible by the choice of size for $V_{i-1} \setminus W_i$) and joins every
$w \in S$ to $v_S$. Moreover, $v_S$ will have no other neighbors in $V_{i-1}$. Note that $W_i$ is empty and every
vertex in $V_{i-1} \setminus W_i$ has degree exactly $s = qd - (q - 1)$.

We now consider the complete $s$-uniform hypergraph on $W_i$. Suppose that the vertices of $W_i$ have
been ordered. For any edge $e = \{w_1, \dots, w_s\}$ of this hypergraph with $w_1 < \dots < w_s$, we assign it the
color $(\chi(w_1 v_e), \dots, \chi(w_s v_e))$, where $v_e$ is the unique vertex in $V_{i-1} \setminus W_i$ joined to each of $w_1, \dots, w_s$ and
$\chi(w_j v_e)$ is the color assigned by Painter to the edge between $w_j$ and $v_e$. This gives a $q^s$-coloring of
the edges of the complete $s$-uniform hypergraph on $W_i$. By the choice of $m_{i-1}$, this set must contain
a monochromatic subgraph of size $qd(n_i + 1)$. We call this set $M_i$.

For any edge $e = \{w_1, \dots, w_s\}$ in $M_i$ with $w_1 < \dots < w_s$, we know that $\chi(w_j v_e) = \chi_j$ for a fixed
sequence of colors $\chi_1, \dots, \chi_s$. By the choice of $s$, we know that there must be a color $C_i$ and a set
of indices $j_1, \dots, j_d$ such that $\chi(w_{j_k} v_e) = C_i$ for all edges $e$ and all $1 \le k \le d$. Suppose now that
the vertices in $M_i$ are $w_1' < \dots < w_\ell'$, where $\ell = qd(n_i + 1)$. Consider the subset of $M_i$ containing
the vertices $w_{qd}', w_{2qd}', w_{3qd}', \dots, w_{n_i qd}'$. Then, for any $1 \le a_1 < \dots < a_d \le n_i$, there exists an edge
$e = \{w_1, \dots, w_s\}$ of the complete $s$-uniform hypergraph on $M_i$ such that $w_{j_k} = w_{a_k qd}'$. This follows
since the vertices in the subsequence are a distance $qd$ apart and we may place up to $qd - 1 \ge s$ dummy
vertices between any pair (and before and after the first and the last terms in the sequence). Therefore,
$\chi(w_{a_k qd}' v_e) = C_i$ for all $1 \le k \le d$. The set $V_i = \{w_{qd}', w_{2qd}', w_{3qd}', \dots, w_{n_i qd}'\}$ has the required property
that for every $d$-set $D$ in $V_i$ there is a vertex $v_D$ in $V_{i-1} \setminus V_i$ such that every vertex $w \in D$ is connected
to $v_D$ in color $C_i$.
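The phrase "by the choice of $s$" refers to a pigeonhole step: since $s = qd - (q - 1) = q(d - 1) + 1$, any sequence of $s$ colors drawn from $q$ colors must repeat some color at least $d$ times, and $s$ is the least length with this property. The brute-force Python sketch below (our own illustrative helpers, feasible only for small $q$ and $d$) checks this directly.

```python
from collections import Counter
from itertools import product


def s_value(q, d):
    """The parameter s = qd - (q - 1) from Theorem 4.1."""
    return q * d - (q - 1)


def some_color_repeats(q, d, length):
    """True if every q-coloring of `length` positions uses some color
    at least d times (checked by exhaustive enumeration)."""
    return all(
        max(Counter(coloring).values()) >= d
        for coloring in product(range(q), repeat=length)
    )
```

For $q = 3$ and $d = 3$ we get $s = 7$: every 3-coloring of 7 positions repeats a color three times, while the coloring $(0, 0, 1, 1, 2, 2)$ shows that 6 positions do not suffice.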

To show that the graph constructed by Builder has a $(qd - (q - 1))$-degenerate ordering, we note that
each vertex in $V_{i-1} \setminus W_i$ has no neighbors in $V_{i-1} \setminus W_i$ and degree $qd - (q - 1)$ in $W_i$. Moreover, there
are no edges between $V_i$ and $W_i \setminus V_i$. If, therefore, we choose our ordering so that for all $i = 1, 2, \dots, t$,
the set of vertices in $V_i$ come before those in $W_i \setminus V_i$ and the set of vertices in $W_i \setminus V_i$ come before those
in $V_{i-1} \setminus W_i$, we will have an $s$-degenerate ordering. This completes the proof. □

In the other direction, it is obvious that Builder must draw a graph with degeneracy at least $d$. It may
well be the case that this simple lower bound is sharp. This was verified by Grytczuk, Hałuszczak and
Kierstead [25] in the case $d = 1$. That is, for any fixed number of colors, they showed that Builder
may force Painter to create a monochromatic copy of any forest while only building a forest.

It would be very interesting to know whether a similar theorem holds when degeneracy is replaced by
maximum degree. This question, first studied in [10], appears difficult.



5 Generalized Ramsey numbers for hypergraphs 

Given natural numbers $s$ and $t$ with $s < t$, the Erdős-Rogers function $f_{s,t}(n)$ is defined as the size
of the largest $K_s$-free subset that may be found in any $K_t$-free graph on $n$ vertices. This function
generalizes the usual Ramsey function, since determining $f_{2,t}(n)$ is the problem of determining the
size of the largest independent set which is guaranteed in any $K_t$-free graph on $n$ vertices.

Since its introduction [T5JEI] and particularly in recent years [SJ [TSJ E23 E31 113 IB] , this function has 
been studied quite intensively. Much of this work has focused on the case where t = s + 1. Here the 
best results that are known [17, 18] are that there are constants $c_1$ and $c_2$ depending only on $s$ such
that

$$c_1 \left(\frac{n \log n}{\log\log n}\right)^{1/2} \le f_{s,s+1}(n) \le c_2 n^{2/3}.$$

The analogous function for hypergraphs was recently studied by Dudek and Mubayi [17]. For $s < t$,
let $f_{s,t}^{(k)}(n)$ be given by

$$f_{s,t}^{(k)}(n) = \min\{\max\{|W| : W \subseteq V(\mathcal{G}) \text{ and } \mathcal{G}[W] \text{ contains no } K_s^{(k)}\}\},$$

where the minimum is taken over all $K_t^{(k)}$-free $k$-uniform hypergraphs $\mathcal{G}$ on $n$ vertices. Dudek and
Mubayi proved that, for $k = 3$ and $3 \le s < t$, this function satisfies

In particular, for $t = s + 1$, this gives constants $c_1$ and $c_2$ depending only on $s$ such that

$$c_1 (\log n)^{1/4} \left(\frac{\log\log n}{\log\log\log n}\right)^{1/2} \le f_{s,s+1}^{(3)}(n) \le c_2 \log n.$$

Here we make an improvement to the lower bound, using ideas on hypergraph Ramsey numbers
developed by the authors in [16].

We will need the following lemma, due to Shearer [41], which gives an estimate for the size of the
largest independent set in a sparse $K_s$-free graph.

Lemma 5.1 There exists a constant $c_s$ such that if $G$ is a $K_s$-free graph on $n$ vertices with average
degree at most $d$ then $G$ contains an independent set of size at least

$$c_s \frac{\log d}{d \log\log d}\, n.$$
The main result of this section is now as follows.

Theorem 5.1 For any natural number $s \ge 3$, there exists a constant $c$ such that

$$f_{s,s+1}^{(3)}(n) \ge c \left(\frac{\log n}{\log\log\log n}\right)^{1/3}.$$

Proof: Let $\mathcal{G}$ be a $K_{s+1}^{(3)}$-free 3-uniform hypergraph on $n$ vertices. Let

$$p = c \left(\frac{\log n}{\log\log\log n}\right)^{1/3},$$

where $c$ is a constant to be determined later. Our aim will be to show that $\mathcal{G}$ contains a $K_s^{(3)}$-free
subgraph of size at least $p$.

We will construct, by induction, a sequence of vertices $v_1, v_2, \dots, v_\ell$ and a non-empty set $V_\ell$ such that,
for any $1 \le i < j \le \ell$, all triples $\{v_i, v_j, v_k\}$ with $j < k \le \ell$ and all triples $\{v_i, v_j, w\}$ with $w \in V_\ell$ are
either all edges of $\mathcal{G}$ or all not edges of $\mathcal{G}$. At each step, we consider the auxiliary graph $G_\ell$ on vertex
set $\{v_1, \dots, v_\ell\}$ formed by connecting $v_i$ and $v_j$ with $i < j$ if and only if all triples $\{v_i, v_j, v_k\}$ with
$j < k \le \ell$ and all triples $\{v_i, v_j, w\}$ with $w \in V_\ell$ are edges of $\mathcal{G}$. We note that $G_\ell$ must be $K_s$-free.
Otherwise, if $\{v_{i_1}, \dots, v_{i_s}\}$ are the vertices of a $K_s$, we may take any vertex $w$ in $V_\ell$ to form a $K_{s+1}^{(3)}$ in
$\mathcal{G}$, namely, $\{v_{i_1}, \dots, v_{i_s}, w\}$.

If at any point $G_\ell$ contains a vertex $u$ with at least $p$ neighbors, we stop the process. Since $G_\ell$ is
$K_s$-free, it follows that the neighborhood $U$ of $u$ in $G_\ell$ is $K_{s-1}$-free. The set $U$ must also be $K_s^{(3)}$-free
in $\mathcal{G}$. Indeed, suppose otherwise and that $\{v_{i_1}, \dots, v_{i_s}\}$ with $i_1 < \dots < i_s$ is a $K_s^{(3)}$ with vertices from
$U$. Then, by the construction of $G_\ell$, the set $\{v_{i_1}, \dots, v_{i_{s-1}}\}$ must be a $K_{s-1}$ in $U$. We may therefore
assume that the maximum degree in each $G_\ell$ is at most $p$.

To begin our induction, we fix $v_1$ and let $V_1 = V(\mathcal{G}) \setminus \{v_1\}$. Suppose now that we have constructed
the sequence $v_1, v_2, \dots, v_\ell$ and a set $V_\ell$ satisfying the required conditions and we wish to find a vertex
$v_{\ell+1}$ and a set $V_{\ell+1}$. We let $v_{\ell+1}$ be any vertex from the set $V_\ell$. Let $V_{\ell,0} = V_\ell \setminus \{v_{\ell+1}\}$. We will construct
a sequence of subsets $V_{\ell,0} \supset V_{\ell,1} \supset \dots \supset V_{\ell,\ell}$ such that all triples $\{v_i, v_{\ell+1}, w\}$ with $1 \le i \le j$ and
$w \in V_{\ell,j}$ are either all edges of $\mathcal{G}$ or all not edges of $\mathcal{G}$, depending only on the value of $i$. (Note that
since each $V_{\ell,j} \subset V_\ell$ it follows that all triples $\{v_i, v_j, w\}$ with $1 \le i < j \le \ell$ and $w \in V_{\ell,j}$ are either all
edges of $\mathcal{G}$ or all not edges of $\mathcal{G}$, depending only on the values of $i$ and $j$.)

Suppose then that $V_{\ell,j}$ has been constructed in an appropriate fashion. To construct $V_{\ell,j+1}$, we consider
the neighborhood of the vertices $v_{j+1}$ and $v_{\ell+1}$ in $V_{\ell,j}$. If this neighborhood has size at least $\alpha |V_{\ell,j}|$,
we let $V_{\ell,j+1}$ be this neighborhood. Otherwise, we let $V_{\ell,j+1}$ be the complement of this neighborhood
in $V_{\ell,j}$. Note that in this case $|V_{\ell,j+1}| \ge (1 - \alpha)|V_{\ell,j}|$. To finish the construction of $V_{\ell+1}$, we let
$V_{\ell+1} = V_{\ell,\ell}$. It is easy to check that it satisfies the required conditions.

We halt the process when $\ell = m$. Recall that the maximum degree of $G_m$ is at most $p$. Since $G_m$ is also
$K_s$-free, Lemma 5.1 implies that there is a constant $c_s$ such that the graph contains an independent
set of size at least

$$c_s \frac{\log p}{p \log\log p}\, m = p,$$

by choosing $m = \frac{1}{c_s} \frac{\log\log p}{\log p}\, p^2$. This in turn implies an independent set of size $p$ in $\mathcal{G}$, completing the
proof provided only that $|V_m| \ge 1$. To verify this, note that if in $G_i$ the vertex $v_i$ has degree $d_i$ then

$$|V_i| \ge \alpha^{d_i} (1 - \alpha)^{i-1-d_i} (|V_{i-1}| - 1) \ge \alpha^{d_i} (1 - \alpha)^{i-1-d_i} |V_{i-1}| - 1.$$

Telescoping over all $i = 2, \dots, m$, it follows that, for $\alpha \le \frac{1}{2}$,

$$|V_m| \ge \alpha^{\sum_{i=2}^{m} d_i} (1 - \alpha)^{\binom{m}{2} - \sum_{i=2}^{m} d_i}\, n - m \ge \alpha^{pm/2} (1 - \alpha)^{\binom{m}{2} - pm/2}\, n - m.$$

The second inequality follows by noting that $\sum_{i=2}^{m} d_i = e(G_m) \le \frac{m}{2} p$ and observing that the function
$\alpha^t (1 - \alpha)^{\binom{m}{2} - t}$ is decreasing in $t$ for $\alpha \le \frac{1}{2}$. Therefore, taking $\alpha = \frac{p}{m} \log(\frac{m}{p})$ (note that $\alpha \le \frac{1}{2}$ for $n$
sufficiently large) and using that $2^{-2x} \le 1 - x$ for $0 \le x \le \frac{1}{2}$,

$$|V_m| \ge \alpha^{pm/2} (1 - \alpha)^{m^2/2}\, n - m \ge (p/m)^{pm/2}\, 2^{-\alpha m^2}\, n - m = 2^{-\frac{3}{2} pm \log(m/p)}\, n - m \ge \sqrt{n} - m \ge 1.$$

In the penultimate inequality, we used that, for $n$ sufficiently large depending on $c_s$, $\log(m/p) \le \log p$ and,
therefore,

$$\frac{3}{2} pm \log(m/p) \le \frac{3}{2c_s}\, p^3\, \frac{\log\log p}{\log p}\, \log p = \frac{3}{2c_s}\, p^3 \log\log p.$$

Hence, since $p = c \left(\frac{\log n}{\log\log\log n}\right)^{1/3}$, for $c$ sufficiently small we have

$$\frac{3}{2} pm \log(m/p) \le \frac{1}{2} \log n.$$

This completes the proof. □
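Two computational facts used at the end of this proof are easy to check numerically: the elementary inequality $2^{-2x} \le 1 - x$ on $[0, \frac{1}{2}]$, and the identity $(p/m)^{pm/2}\, 2^{-\alpha m^2} = 2^{-\frac{3}{2} pm \log(m/p)}$ when $\alpha = \frac{p}{m}\log(\frac{m}{p})$ with logarithms to base 2. The following Python sketch, with function names of our own choosing and floating-point arithmetic, is meant only as a sanity check of these two steps.

```python
import math


def alpha(p, m):
    """The choice alpha = (p/m) * log2(m/p) made in the proof."""
    return (p / m) * math.log2(m / p)


def exponent_direct(p, m):
    """log2 of (p/m)^(pm/2) * 2^(-alpha * m^2), computed term by term."""
    return (p * m / 2) * math.log2(p / m) - alpha(p, m) * m * m


def exponent_closed_form(p, m):
    """The closed form -(3/2) * p * m * log2(m/p)."""
    return -1.5 * p * m * math.log2(m / p)


def inequality_holds(steps=1000, tol=1e-12):
    """Check 2^(-2x) <= 1 - x on a grid of points in [0, 1/2]."""
    xs = (0.5 * i / steps for i in range(steps + 1))
    return all(2 ** (-2 * x) <= 1 - x + tol for x in xs)
```

With $p = 4$ and $m = 64$, both exponent computations give $-1536$. Equality in $2^{-2x} \le 1 - x$ holds at both endpoints $x = 0$ and $x = \frac{1}{2}$, which is why a small tolerance is used.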

This result easily extends to higher uniformities to give that $f_{s,s+1}^{(k)}(n) \ge (\log_{(k-2)} n)^{1/3 - o(1)}$. Here
$\log_{(0)} x = x$ and $\log_{(i+1)} x = \log(\log_{(i)} x)$. This improves an analogous result of Dudek and Mubayi [17]
with a $1/4$ in the exponent but remains far from their upper bound $f_{s,s+1}^{(k)}(n) \le c_{s,k} (\log n)^{1/(k-2)}$. It
is an interesting open question to close the gap between the upper and lower bounds.



6 Clique partitions of very dense graphs 

A clique partition of a graph G is a collection of complete subgraphs of G that partition the edge set 
of G. The clique partition number cp(G) is the smallest number of cliques in a clique partition of G. 
Despite receiving considerable attention over the last 60 years, this interesting graph parameter is still 
not well understood. 

One class of graphs for which this parameter has been studied quite extensively is when the graph $G$
is the complement $\overline{F}$ of a sparse graph $F$. Orlin [38] was the first to ask about the asymptotics of the
clique partition number for the complement of a perfect matching. If $F$ is a perfect matching on $n$
vertices, Wallis [19] showed that $cp(\overline{F}) = o(n^{1+\epsilon})$ for any $\epsilon > 0$. This was later improved by Gregory,
McGuinness and Wallis [24] to $cp(\overline{F}) = O(n \log\log n)$. Wallis [50] later showed that the same bound
holds if $F$ is a Hamiltonian path on $n$ vertices. For any forest $F$ on $n$ vertices, Cavers and Verstraëte
[11] proved that $cp(\overline{F}) = O(n \log n)$. They also conjectured that there are forests $F$ on $n$ vertices such
that $cp(\overline{F})$ grows superlinearly in $n$. Here we extend their proof to give the following improvement.

Theorem 6.1 If $F$ is a forest on $n$ vertices, then $cp(\overline{F}) = O(n \log\log n)$.
A Steiner $(n, k)$-system is a family of $k$-element subsets of an $n$-element set such that each pair of
elements appears in exactly one of the subsets. One can view a Steiner $(n, k)$-system as a clique
partition of $K_n$ into cliques of size $k$. It is an open problem to determine for which pairs $(n, k)$
a Steiner $(n, k)$-system exists. Necessary conditions for the existence of a Steiner $(n, k)$-system are
$n \equiv 1 \pmod{k - 1}$ and $n(n - 1) \equiv 0 \pmod{k(k - 1)}$. Wilson [51] showed that for each $k$ there is $n(k)$
such that the necessary conditions for the existence of a Steiner $(n, k)$-system are sufficient provided
that $n > n(k)$. An explicit upper bound on $n(k)$ which is triple exponential in $k^2 \log k$ was proved by
Chang [H].
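The two necessary divisibility conditions are straightforward to test. The short Python sketch below (the function name is ours) checks them for a given pair $(n, k)$; it says nothing about sufficiency, which is the content of Wilson's theorem for large $n$.

```python
def steiner_conditions_hold(n, k):
    """Necessary conditions for a Steiner (n, k)-system with k >= 2:
    n = 1 (mod k - 1) and n(n - 1) = 0 (mod k(k - 1))."""
    return (n - 1) % (k - 1) == 0 and (n * (n - 1)) % (k * (k - 1)) == 0
```

For example, the conditions hold for $(7, 3)$ and $(9, 3)$ but fail for $(8, 3)$. They also hold for every pair $(q^2 + q + 1, q + 1)$, since $n - 1 = q(q + 1)$ in that case.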

For sparse graphs $F$, Cavers and Verstraëte [11] proved an upper bound on the clique partition
number $cp(\overline{F})$ which is conditional on the existence of certain Steiner systems. They prove that if $F$
is a graph on $n$ vertices with maximum degree $\Delta = o(n / \log^4 n)$ and there is a Steiner $(n, k)$-system
with k = L(i5;) 1/2 J then $cp(\overline{F}) = O(n^{3/2} \Delta^{1/2} \log^2 n)$. They also conjectured that if the maximum
degree of $F$ is $o(n)$ then $cp(\overline{F}) = o(n^2)$. This conjecture follows from the next theorem, which gives
an unconditional improvement on their bound.

Theorem 6.2 If $F$ has $n$ vertices and $m \ge \sqrt{n}$ edges, then $cp(\overline{F}) = O((mn)^{2/3})$.

We have the following immediate corollary, showing that we can partition the complement of a sparse 
graph into few cliques. 

Corollary 6.1 If $F$ has $n$ vertices and $o(n^2)$ edges, then $cp(\overline{F}) = o(n^2)$.

To prove Theorem 6.2, we will first prove a useful lemma saying that, for $2 \le k \le n$, we can partition
the complete graph $K_n$ into $O(\max((n/k)^2, n))$ cliques of order at most $k$. We begin with some
elementary lemmas. For graphs $G$ and $H$, let $G \cup H$ denote the disjoint union of $G$ and $H$.

Lemma 6.1 For $s, t, n$ with $s + t \le n$, we have $cp(K_n \setminus (K_s \cup K_t)) \ge st$.

Proof: Indeed, in a clique partition of $K_n \setminus (K_s \cup K_t)$, each of the $st$ edges between the independent
set of size $s$ and the independent set of size $t$ must be in different cliques. □

Lemma 6.2 If $G$ has $n$ vertices and all but $t \ge 1$ vertices of $G$ have degree $n - 1$, then $cp(G) \le tn$.

Proof: The clique partition of $G$ consisting of the clique on the $n - t$ vertices of degree $n - 1$ and a
$K_2$ for each remaining edge uses at most $1 + t(n - 2) \le tn$ cliques. □
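The partition used in the proof of Lemma 6.2 can be written out explicitly. The following Python sketch is our own illustration, with vertices labeled $0, \dots, n - 1$ and a hypothetical function name: it builds one clique on the vertices of degree $n - 1$ and one $K_2$ for every remaining edge.

```python
from itertools import combinations


def lemma_6_2_partition(n, edges):
    """Clique partition from the proof of Lemma 6.2.

    n: number of vertices, labeled 0..n-1.
    edges: iterable of pairs of vertices.
    Returns (full, cliques), where `full` lists the vertices of degree n - 1
    and `cliques` is a list of vertex tuples partitioning the edge set.
    """
    edge_set = {frozenset(e) for e in edges}
    degree = [0] * n
    for e in edge_set:
        for v in e:
            degree[v] += 1
    full = [v for v in range(n) if degree[v] == n - 1]
    # One big clique on the full-degree vertices (if it has any edges).
    cliques = [tuple(full)] if len(full) >= 2 else []
    inside = {frozenset(e) for e in combinations(full, 2)}
    # Every remaining edge becomes its own K_2.
    cliques += [tuple(sorted(e)) for e in edge_set - inside]
    return full, cliques
```

For instance, removing the edges $\{0, 1\}$ and $\{0, 2\}$ from $K_6$ leaves $t = 3$ vertices of degree less than 5, and the construction uses $1 + 10 = 11 \le tn = 18$ cliques.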

In order to prove our main auxiliary lemma, we need to know a little about the particular class of 
Steiner systems known as projective planes. A projective plane consists of a set of points and a set of 
lines and a relation between points and lines called incidence having the following properties: 

• For any two distinct points, there is exactly one line incident to both of them, 

• For any two distinct lines, there is exactly one point incident to both of them, 

• There are four points such that no line is incident with more than two of them. 
The first condition says that any two points determine a line, the second condition says that any two
lines intersect in one point and the last condition excludes some degenerate cases. It can be shown
that a projective plane has the same number of lines as it has points. Any finite projective plane has
$q^2 + q + 1$ points for some positive integer $q$ and we denote such a projective plane by $P_q$. Each line is
incident with $q + 1$ points and each point is incident with $q + 1$ lines. Therefore, the lines of $P_q$ form
a Steiner $(q^2 + q + 1, q + 1)$-system. It is known that if $q$ is a prime power then there is a projective
plane on $q^2 + q + 1$ points. It is a famous open problem to determine if there exist finite projective
planes of other orders.
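For prime $q$, a projective plane can be built from the one-dimensional subspaces of the vector space $\mathbb{F}_q^3$, with a point incident to a line when the corresponding vectors have dot product zero. The Python sketch below uses this standard construction (it is not specific to the paper, the function names are ours, and it handles only prime $q$, not general prime powers) and checks that the lines form a clique partition of $K_{q^2+q+1}$ into cliques of size $q + 1$.

```python
from itertools import combinations


def projective_plane_lines(q):
    """Lines of the projective plane over F_q for a prime q.

    Returns (n, lines), where n = q^2 + q + 1 and each line is a
    frozenset of point indices in range(n).
    """
    # Normalised representatives of the 1-dimensional subspaces of F_q^3:
    # the first nonzero coordinate equals 1.
    reps = [(1, a, b) for a in range(q) for b in range(q)]
    reps += [(0, 1, a) for a in range(q)]
    reps.append((0, 0, 1))
    lines = []
    for line in reps:
        lines.append(frozenset(
            i for i, point in enumerate(reps)
            if sum(x * y for x, y in zip(line, point)) % q == 0
        ))
    return len(reps), lines


def is_clique_partition(n, cliques):
    """True if every edge of K_n lies in exactly one of the given cliques."""
    seen = set()
    for clique in cliques:
        for edge in combinations(sorted(clique), 2):
            if edge in seen:
                return False
            seen.add(edge)
    return len(seen) == n * (n - 1) // 2
```

For $q = 2$ this produces the Fano plane: 7 points and 7 lines of size 3.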

Lemma 6.3 Let $k \ge 2$ and $n$ be positive integers and let $f(n, k)$ denote the minimum number of
cliques each on at most $k$ vertices needed to clique partition $K_n$. If $n \le k$, trivially $f(n, k) = 1$. If
$n > k$, then

$$f(n, k) = \Theta(\max((n/k)^2, n)).$$



Proof: We first prove the lower bound. The pigeonhole principle implies the lower bound
$f(n,k) \ge \binom{n}{2}/\binom{k}{2} \ge (n/k)^2$, as we need to cover $\binom{n}{2}$ edges by cliques each with at most $\binom{k}{2}$ edges. Now suppose
$\sqrt{n} \le k < n$ and we have a partition of the edges of $K_n$ into cliques each with at most $k$ vertices. Let
$s \ge t$ be the sizes of the two largest cliques used in the partition, so $t \ge 2$. If $s \le 2\sqrt{n}$, then the number of
cliques used is at least $f(n,s) \ge n^2/s^2 \ge n/4$. So suppose $s > 2\sqrt{n}$. By Lemma 6.1, there are at least
$(s-1)(t-1)$ remaining cliques in the partition (the two largest cliques may intersect in one vertex). So suppose
$(s-1)(t-1) < n/4$. Then $st < n$. Since all cliques besides the largest have at most $t$ vertices, the
number of cliques used is at least

\[ 1 + \frac{\binom{n}{2} - \binom{s}{2}}{\binom{t}{2}} = 1 + \frac{(n-s)(n+s-1)}{t(t-1)} \ge 1 + \frac{(n-s)(n+s-1)}{(n/s)((n/s)-1)} = 1 + \frac{s^2(n+s-1)}{n} > 1 + s^2 > n, \]

as $\sqrt{n} < s$ and $t < n/s$. We have thus proved that $f(n,k) \ge \max((n/k)^2, n/4)$.

We now turn to proving the upper bound. We will prove by induction on $n$ that the bound $f(n,k) \le
\max(200(n/k)^2, 4n)$ holds. We may assume that $k \ge 20$, as otherwise we can clique partition $K_n$ into
$\binom{n}{2} \le 200(n/k)^2$ edges. We will use Bertrand's postulate that there is a prime between $x$ and $2x$ for
every $x > 1$. Much better estimates are known on the distribution of the primes, but this is sufficient
for our purposes.

If $k \ge 2\sqrt{n}$, let $q$ be the smallest prime such that $q^2 + q + 1 \ge n$. Bertrand's postulate implies
that $q^2 + q + 1 \le 4n$, and hence $q < k$. Consider a projective plane $P_q$ on $q^2 + q + 1$ points and
edge-partition the complete graph on these points into cliques of size $q + 1 \le k$ given by the lines of
$P_q$. There are $q^2 + q + 1 \le 4n$ such cliques and, restricting to $n$ of these points, we get in this case
$f(n,k) \le q^2 + q + 1 \le 4n$.
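The two numerical facts used in this case, namely that the smallest admissible prime $q$ satisfies $q^2 + q + 1 \le 4n$ and that $q + 1 \le k$ whenever $k \ge 2\sqrt{n}$, can be sanity-checked for small $n$. This is only an illustrative check over a finite range, not a substitute for the argument via Bertrand's postulate; the helper names are hypothetical.

```python
from math import isqrt

def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    return p >= 2 and all(p % d for d in range(2, isqrt(p) + 1))

def smallest_plane_order(n):
    """Smallest prime q with q^2 + q + 1 >= n."""
    q = 2
    while not is_prime(q) or q * q + q + 1 < n:
        q += 1
    return q

def check_large_k_case(n):
    """Verify q^2 + q + 1 <= 4n and q + 1 <= k for the least integer k >= 2*sqrt(n)."""
    q = smallest_plane_order(n)
    k = isqrt(4 * n - 1) + 1          # ceiling of 2*sqrt(n), computed exactly
    return q * q + q + 1 <= 4 * n and q + 1 <= k

print(all(check_large_k_case(n) for n in range(2, 3000)))
```

Checking the least integer $k \ge 2\sqrt{n}$ suffices, since the condition $q + 1 \le k$ only becomes easier for larger $k$.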

If $k < 2\sqrt{n}$, let $q$ be the smallest prime which is at least $n/k + k$. Bertrand's postulate implies that
$q \le 2(n/k + k) < 2(n/k + 2\sqrt{n}) < 10n/k$. The Turán graph $T_{n,k}$ is the complete $k$-partite graph on
$n$ vertices with parts of size as equal as possible. We will find a partition of the edges of $T_{n,k}$ into
cliques of size at most $k$ using at most $q^2 + q - k + 1$ cliques. Let $S_1, \ldots, S_k$ denote the $k$ independent
sets of $T_{n,k}$. Let $L_1, L_2, \ldots, L_k$ denote $k$ lines of the projective plane $P_q$. View each $S_i$ as a subset of
points of the line $L_i$ such that the points of $S_1, \ldots, S_k$ do not include any of the intersection points of
$L_1, \ldots, L_k$. We can do this since $|S_i| \le \lceil n/k \rceil$, $|L_i| = q + 1 \ge n/k + k + 1$, each pair of lines have one
intersection point, and so each $L_i$ has at most $k - 1$ intersection points in total with all the other $L_j$.
Every pair of points in $S_1 \cup \cdots \cup S_k$ other than those inside one of the $S_i$ are contained in exactly one
of the $q^2 + q - k + 1$ lines of $P_q$ other than $L_1, \ldots, L_k$. Since each of these lines intersects $L_i$ and hence
$S_i$ in at most one point, these $q^2 + q - k + 1$ lines give an edge-partition of $T_{n,k}$ into $q^2 + q - k + 1$
cliques each with at most $k$ vertices. This gives the bound

\[ f(n,k) \le q^2 + q - k + 1 + k f(\lceil n/k \rceil, k) \le 110(n/k)^2 + k \max\left(200 \lceil n/k \rceil^2 / k^2, 4 \lceil n/k \rceil\right) \]
\[ \le 110(n/k)^2 + 800n^2/k^3 + 8n \le 200(n/k)^2, \]

where we use $20 \le k < 2\sqrt{n}$ and that, for each $k$, $f(n,k)$ is a monotonically increasing function of
$n$. This latter fact follows by noting that restricting a clique partition of a clique to a subclique is a
clique partition of the subclique. This completes the proof by induction on $n$. □
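The last inequality in the displayed bound, $110(n/k)^2 + 800n^2/k^3 + 8n \le 200(n/k)^2$, is where the assumptions $k \ge 20$ and $k < 2\sqrt{n}$ enter. A small sketch that checks it in exact integer arithmetic (after multiplying through by $k^3$) over a finite range of parameters; the function name is hypothetical.

```python
def induction_step_holds(n, k):
    """Check 110(n/k)^2 + 800 n^2/k^3 + 8n <= 200 (n/k)^2, cleared of denominators."""
    lhs = 110 * n * n * k + 800 * n * n + 8 * n * k ** 3
    rhs = 200 * n * n * k
    return lhs <= rhs

# k < 2*sqrt(n) is equivalent to 4n > k^2, i.e. n >= k*k//4 + 1.
ok = all(
    induction_step_holds(n, k)
    for k in range(20, 80)
    for n in range(k * k // 4 + 1, k * k // 4 + 300)
)
print(ok, induction_step_holds(100, 2))
```

The second printed value shows that the inequality can fail when $k$ is small, which is why the case $k < 20$ is handled separately by partitioning into edges.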

One can easily modify the above proof, using the prime number theorem instead of Bertrand's
postulate, to show that $f(n,k) = (1 + o(1)) \frac{n^2}{k(k-1)}$ as long as $k = o(\sqrt{n})$.

Proof of Theorem 6.2: Let $F$ be a graph with $n$ vertices and $m \ge \sqrt{n}$ edges and $k = (n^2/m)^{1/3} \le
\sqrt{n}$. By Lemma 6.3, we can partition the complete graph on $n$ vertices into $N = O((n/k)^2)$ cliques
$Q_1, \ldots, Q_N$ each of order at most $k$. For each clique $Q_i$ let $t_i$ be the number of edges of $F$ it contains.
Each $Q_i$ that contains no edge of $F$ will be used in the clique partition of $\overline{F}$. If $t_i \ge 1$, we use Lemma
6.2 to partition the induced subgraph of $\overline{F}$ on the vertex set of $Q_i$ into at most $2t_i|Q_i| \le 2t_i k$ cliques. Thus,
we get that $\overline{F}$ can be partitioned into at most $N + 2\sum_i t_i k = N + 2mk = O((mn)^{2/3})$ cliques, which
completes the proof of Theorem 6.2. □
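The choice $k = (n^2/m)^{1/3}$ balances the two terms in the count: both $(n/k)^2$ and $mk$ equal $(mn)^{2/3}$. A short numerical illustration with floating-point arithmetic; the function name and the sample values of $n$ and $m$ are arbitrary choices made for this sketch.

```python
from math import isclose

def balance(n, m):
    """Return (k, (n/k)^2, m*k, (m*n)^(2/3)) for k = (n^2/m)^(1/3)."""
    k = (n * n / m) ** (1 / 3)
    return k, (n / k) ** 2, m * k, (m * n) ** (2 / 3)

k, grid_term, repair_term, target = balance(10**6, 10**4)
print(isclose(grid_term, repair_term, rel_tol=1e-9),
      isclose(repair_term, target, rel_tol=1e-9),
      k <= 10**3)
```

Here `grid_term` stands for the number of cliques coming from Lemma 6.3 (up to the constant factor) and `repair_term` for the cliques contributed through Lemma 6.2; the last comparison illustrates that $k \le \sqrt{n}$ when $m \ge \sqrt{n}$.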

Note that adding an edge to a graph can increase the clique partition number by at most 1. It follows
that for any forest $F$, if $T$ is a spanning tree containing $F$ then $\mathrm{cp}(\overline{F}) \le \mathrm{cp}(\overline{T}) + n - 1$. So, to prove
Theorem 6.1 it is sufficient to prove it for trees. Let $g(n)$ denote the maximum of $\mathrm{cp}(\overline{T})$ over all trees
$T$ on $n \ge 2$ vertices. We will prove by induction that

\[ g(n) \le 100n(1 + \log \log n), \]

which verifies Theorem 6.1. This inequality clearly holds for $n \le 200$ as $g(n) \le \binom{n}{2} < 100n$ in this
case. So suppose $n > 200$.

A tree partition of a graph $G$ is a collection $\{T_1, \ldots, T_r\}$ of subtrees of $G$ such that each edge of $G$ is
in exactly one tree and, for all $i \ne j$, $T_i$ and $T_j$ have at most one vertex in common. We will use the
following simple lemma from [11].

Lemma 6.4 Let $T$ be a tree on $n$ vertices and $2 \le k \le n$. Then there exists a tree partition
$\{T_1, \ldots, T_r\}$ of $T$ into at most $2n/k$ trees such that the number of vertices of each $T_i$ is between
$k/3$ and $k$.

Proof of Theorem 6.1: Let $k = \sqrt{n}$ and apply the above lemma to find a tree partition of $T$ into
$r \le 2n/k = 2\sqrt{n}$ trees each with at most $k = \sqrt{n}$ vertices. Order these trees $\{T_1, \ldots, T_r\}$ so that for
$i \ge 2$ the union $\bigcup_{j=1}^{i} V(T_j)$ is connected. Then there is exactly one vertex $v_i$ of $T_i$ that is contained
in $\bigcup_{j=1}^{i-1} V(T_j)$. By Bertrand's postulate, there is a prime $q$ satisfying $3\sqrt{n} \le q \le 6\sqrt{n}$. We will show
that there is a one-to-one mapping of the vertices of $T$ into the points of the projective plane $P_q$ on
$q^2 + q + 1$ points such that each $T_i$ is contained in a line $L_i$ and no vertices of $T$ other than those in
$T_i$ map to a point in $L_i$.






Let $L_1$ be an arbitrary line in the projective plane $P_q$ on $q^2 + q + 1$ points. Arbitrarily embed the
vertices of $T_1$ into the points of $L_1$ with the vertex $v_2$ identified with some point $w_2$. This can be done
since $|T_1| \le k \le q + 1 = |L_1|$. Suppose we have already embedded $T_1, \ldots, T_{i-1}$ into lines $L_1, \ldots, L_{i-1}$
and let $w_i$ denote the image of the vertex $v_i$, which is in at least one line $L_j$ with $j < i$. Pick an arbitrary
line $L_i$ through $w_i$ with $L_i \ne L_j$ for $j < i$. Since there are $q + 1 > r > i - 1$ lines through each point, and,
in particular, through $w_i$, we can indeed pick such a line $L_i$. Arbitrarily embed the remaining vertices
$V(T_i) \setminus \{v_i\}$ of $T_i$ amongst the points of $L_i$ not in any $L_j$ with $j < i$. Since any two lines intersect in
exactly one point, $L_i$ has at least $q + 1 - (i - 1) \ge q + 2 - r \ge k \ge |T_i|$ points not in any $L_j$ with $j < i$.
Thus we can indeed embed these remaining vertices. This demonstrates that we can find the desired
mapping of the vertices of $T$ into the points of $P_q$.

We next construct a clique partition of $\overline{T}$. For each line of the projective plane not containing an edge of
$T$, we use the corresponding clique restricted to the vertices of $T$. For each line of the projective plane
containing an edge of $T$, this line contains the vertices of some $T_i$ and no other vertex of $T$, so we use at most
$g(|T_i|)$ cliques to partition the edges of the complement of $T_i$. Note that $\sum_{i=1}^{r} |T_i| = |T| + r - 1 \le n + 2n/k - 1 < n + 2\sqrt{n}$.
Using this estimate, together with the inequalities $|T_i| \le \sqrt{n}$, $n > 200$ and the induction hypothesis,
we get

\[ g(n) \le q^2 + q + 1 + \sum_{i=1}^{r} g(|T_i|) \le 45n + 100(n + 2\sqrt{n})(1 + \log \log \sqrt{n}) \]
\[ \le 45n + 100(n + 2\sqrt{n}) \log \log n \le 100n(1 + \log \log n), \]

as required. □
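The numerical steps in this final chain can be checked for a range of $n > 200$. The sketch below assumes that logarithms are taken to base 2, which is what the step $1 + \log \log \sqrt{n} \le \log \log n$ appears to require (it then holds with equality); this base is an assumption and is not stated in the text. The function name is hypothetical.

```python
from math import log2, sqrt

def induction_inequalities_hold(n):
    """Check the two numerical steps in the bound on g(n), assuming base-2 logarithms."""
    q_max = 6 * sqrt(n)                       # upper end of the Bertrand interval for q
    plane_ok = q_max ** 2 + q_max + 1 <= 45 * n
    ll = log2(log2(n))
    final_ok = 45 * n + 100 * (n + 2 * sqrt(n)) * ll <= 100 * n * (1 + ll)
    return plane_ok and final_ok

print(all(induction_inequalities_hold(n) for n in range(201, 5000)))
```

This is a finite floating-point check only; it illustrates the inequalities rather than proving them for all $n$.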

7 Hilbert cubes in dense sets 

A Hilbert cube or an affine cube is a set $H \subset \mathbb{N}$ of the form

\[ H = H(x_0, x_1, \ldots, x_d) = \Big\{ x_0 + \sum_{i \in I} x_i : I \subseteq [d] \Big\}, \]

where $x_0$ is a non-negative integer and $x_1, \ldots, x_d$ are positive integers. We will always assume here
that the generators $x_1, \ldots, x_d$ are all distinct. We refer to the index $d$ as the dimension.
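As a concrete illustration of the definition, a Hilbert cube can be generated by adding the generators one at a time. This is a minimal sketch; the function name `hilbert_cube` is chosen here for illustration.

```python
def hilbert_cube(x0, gens):
    """Return H(x0, x1, ..., xd): all sums x0 + (sum of a subset of the generators)."""
    cube = {x0}
    for x in gens:
        # Each generator is either left out or included in every existing sum.
        cube |= {h + x for h in cube}
    return cube

print(sorted(hilbert_cube(5, [1, 2])))     # [5, 6, 7, 8]
print(len(hilbert_cube(0, [1, 2, 3])))     # 7
```

The second example shows that a cube of dimension $d$ may have fewer than $2^d$ elements, since different index sets can give the same sum.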

One of the earliest results in Ramsey theory is a theorem of Hilbert stating that if $n$ is sufficiently
large then any coloring of the set $[n]$ with a fixed number of colors, say $r$, must contain a monochromatic
Hilbert cube of dimension $d$. The smallest such $n$ we denote by $h(d,r)$. The best known upper bound
for this function is

\[ h(d,r) \le (2r)^{2^{d-1}}. \]

As noted in [9], a double exponential upper bound already follows from Hilbert's original argument.
The result we have quoted is a slight strengthening noted by Gunderson, Rödl and Sidorenko
which follows from a stronger density statement.

This density version states that for any natural number $d$ and $\delta > 0$ there exists an $n_0$ such that if
$n \ge n_0$ any subset of $[n]$ containing $\delta n$ elements contains a Hilbert cube of dimension $d$. Quantitative
versions of this lemma imply that any subset of $[n]$ of fixed positive density $\delta$ contains a Hilbert cube
of dimension at least $c' \log \log n$, where $c'$ depends only on $\delta$. This in turn gives a double exponential
upper bound for the original coloring problem.

On the other hand, Hegyvári [28] gave a lower bound by proving that with high probability a random
subset of $[n]$ of small but fixed positive density $\delta$ does not contain Hilbert cubes of dimension
$c\sqrt{\log n \log \log n}$, where $c$ depends only on $\delta$. Here we improve this result as follows.

Theorem 7.1 For any $0 < \delta < 1$, there exists $c > 0$ such that with high probability a random subset
of the set $[n]$, where each element is chosen independently with probability $\delta$, does not contain Hilbert
cubes of dimension $c\sqrt{\log n}$.

By another result of Hegyvári [28], this theorem is sharp up to the constant $c$. That is, with high
probability, dense random subsets of $[n]$ contain Hilbert cubes of dimension $c'\sqrt{\log n}$ for some $c'$
depending on the density.

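Let $X$ be a set with elements $1 \le x_1 < x_2 < \cdots < x_d$ and write

\[ \Sigma(X) = \Big\{ \sum_{i \in I} x_i : I \subseteq [d] \Big\}. \]

Note that a Hilbert cube is just a translation of an appropriate $\Sigma(X)$. The following basic estimates
for $|\Sigma(X)|$ will be useful to us. For a proof, see [27].

Lemma 7.1 For any set $X$ with $d$ elements,

\[ \binom{d+1}{2} + 1 \le |\Sigma(X)| \le 2^d. \]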
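The set of subset sums of $X$, written $\Sigma(X)$ and including the empty sum 0, is easy to compute for small sets. The sketch below takes the estimates of Lemma 7.1 to be $\binom{d+1}{2} + 1 \le |\Sigma(X)| \le 2^d$ for a set of $d$ distinct positive integers, the lower bound being the one invoked later in the proof of Lemma 7.3, and shows that both ends are attained. Function names are hypothetical.

```python
from math import comb

def subset_sums(xs):
    """Return the set of all subset sums of xs, including the empty sum 0."""
    sums = {0}
    for x in xs:
        sums |= {s + x for s in sums}
    return sums

def within_lemma_bounds(xs):
    """Check C(d+1, 2) + 1 <= |Sigma(X)| <= 2^d for distinct positive integers xs."""
    d = len(xs)
    return comb(d + 1, 2) + 1 <= len(subset_sums(xs)) <= 2 ** d

d = 6
interval = list(range(1, d + 1))          # attains the lower bound
powers = [2 ** i for i in range(d)]       # attains the upper bound
print(len(subset_sums(interval)), comb(d + 1, 2) + 1)
print(len(subset_sums(powers)), 2 ** d)
```

An interval $\{1, \ldots, d\}$ is as structured as possible and has few subset sums, while powers of two have all $2^d$ sums distinct; this is the contrast that the inverse Littlewood-Offord theorem quantifies.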



The main new ingredient that we use is the following inverse Littlewood-Offord theorem due to Tao
and Vu [17] (see also [37] and [48] for improved versions). This says that if $|\Sigma(X)|$ is small then $X$
must be highly structured. A generalized arithmetic progression, or GAP for short, is a subset $Q$ of $\mathbb{Z}$
of the form

\[ Q = \{x_0 + a_1 x_1 + \cdots + a_r x_r : 1 \le a_i \le A_i\}. \]

We refer to $r$ as the rank of the GAP $Q$ and the product $A_1 \cdots A_r$ as the volume of $Q$. Rephrased in
our terms, the inverse Littlewood-Offord theorem states that if $|\Sigma(X)|$ is small then almost all of $X$
must be contained in a GAP $Q$ of low rank.

Lemma 7.2 For every $C > 0$ and $0 < \epsilon < 1$, there exist positive constants $r$ and $C'$ such that if $X$
is a multiset with $d$ elements and $|\Sigma(X)| \le d^C$ then there is a GAP $Q$ of dimension $r$ and volume at
most $d^{C'}$ such that all but at most $d^{1-\epsilon}$ elements of $X$ are contained in $Q$.

The following lemma is the key step in our proof.

Lemma 7.3 For $s \le \log d$, the number of $d$-sets $X \subset [n]$ with $|\Sigma(X)| \le 2^s d^2$ is at most $n^{O(s)} d^{O(d)}$.

Proof: Let $C = 3$, $\epsilon = 1/2$ and $r$ and $C'$ be the constants given by Lemma 7.2. If a $d$-set $X$ satisfies
$|\Sigma(X)| \le 2^s d^2 \le d^3$, then, by Lemma 7.2, it has a subset $X' \subset X$ which lies in a GAP $Q$ of dimension
$r$ and volume at most $d^{C'}$ such that $|X \setminus X'| \le d^{1-\epsilon} = d^{1/2}$.
The number of possible choices for $Q$ is $d^{O(1)} n^{O(1)}$ since it is of constant dimension. For each such $Q$,
the number of subsets of $Q$ of size at most $d$ is at most $\binom{|Q|}{\le d} \le |Q|^d = d^{O(d)}$. Thus, the number of
choices of $X'$ is at most $d^{O(d)} n^{O(1)}$.

Let $m = |X \setminus X'|$ and order the elements of $X \setminus X'$ as $x_1 < \cdots < x_m$. Consider the nested sequence
$X_0 \subset X_1 \subset \cdots \subset X_m$ of sets where $X_0 = X'$ and $X_{i+1} = X_i \cup \{x_{i+1}\}$, so $X_m = X$. If $|\Sigma(X_{i+1})| < 2|\Sigma(X_i)|$,
then $x_{i+1}$ is in the difference set $\Sigma(X_i) - \Sigma(X_i)$ and there are at most $|\Sigma(X_i)|^2 \le |\Sigma(X)|^2 \le 2^{2s} d^4 \le d^6$
such choices for $x_{i+1}$. Otherwise, $|\Sigma(X_{i+1})| = 2|\Sigma(X_i)|$ and there are at most $n$ choices for $x_{i+1}$.

Note that, by Lemma 7.1, we have $|\Sigma(X')| \ge \binom{|X'|+1}{2} + 1 \ge d^2/4$. Suppose there are $t$ indices $i$
for which $|\Sigma(X_{i+1})| = 2|\Sigma(X_i)|$. We have $t \le s + 2$ as $|\Sigma(X_0)| = |\Sigma(X')| \ge d^2/4$ and $2^s d^2 \ge |\Sigma(X)| =
|\Sigma(X_m)| \ge 2^t |\Sigma(X_0)|$.

Thus, after selecting the at most $s + 2$ indices $i$ for which $|\Sigma(X_{i+1})| = 2|\Sigma(X_i)|$, we have that the
number of possible $d$-sets $X \subset [n]$ such that $|\Sigma(X)| \le 2^s d^2$ is at most

\[ d^{O(d)} n^{O(1)} \binom{m}{\le s+2} n^{s+2} d^{6m} = d^{O(d)} n^{O(s)}. \]

This completes the proof of the lemma. □
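The dichotomy used in the proof rests on the identity that the subset sums of $X \cup \{x\}$ are the union of the subset sums of $X$ and their translate by $x$. Hence the number of subset sums doubles exactly when the two copies are disjoint, that is, when $x$ is not a difference of two subset sums of $X$. A small sketch checking this equivalence on examples; function names are hypothetical.

```python
def subset_sums(xs):
    """Return the set of all subset sums of xs, including the empty sum 0."""
    sums = {0}
    for x in xs:
        sums |= {s + x for s in sums}
    return sums

def doubles(xs, x):
    """True if adding x to xs exactly doubles the number of subset sums."""
    base = subset_sums(xs)
    return len(base | {s + x for s in base}) == 2 * len(base)

def in_difference_set(xs, x):
    """True if x = a - b for some subset sums a, b of xs."""
    base = subset_sums(xs)
    return any(s + x in base for s in base)

print(doubles([1, 2, 3], 10), in_difference_set([1, 2, 3], 10))   # True False
print(doubles([1, 2, 3], 4), in_difference_set([1, 2, 3], 4))     # False True
```

In the counting argument, elements of the second kind are cheap to choose (they come from a set of size at most $d^6$), while elements of the first kind can occur only about $s$ times.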



Proof of Theorem 7.1: Let $A$ be the random set formed by choosing each element independently
with probability $\delta$ and let $E$ be the event that $A$ contains a Hilbert cube of dimension $d = c\sqrt{\log n}$.
We wish to show that the probability of the event $E$ tends to zero with $n$.

Let $m_t$ denote the number of $d$-sets $X \subset [n]$ with $|\Sigma(X)| = t$. We have

\[ \mathbb{P}[E] \le n \sum_{t} m_t \delta^t = n \sum_{t < d^3} m_t \delta^t + n \sum_{t \ge d^3} m_t \delta^t. \]

Note that the extra factor of $n$ comes from summing over all possible first coordinates $x_0$ for the Hilbert
cube. The total number of choices for $X$ is at most $n^d$ by choosing the basis elements arbitrarily. Hence,
the second term is at most $n \cdot n^d \cdot \delta^{d^3} = o(1)$, where we use $d = c\sqrt{\log n}$ and $c$ is a sufficiently large
constant depending on $\delta$.

We next bound the first term above. Let $a_s = 2^{s-3} d^2$ and $n_s = \sum_{t = a_s}^{a_{s+1}} m_t$. We use Lemma 7.3, which
gives $n_s \le n^{O(s)} d^{O(d)}$. Using a dyadic partition of the sum, the first term is at most

\[ n \sum_{s=1}^{2 + \log d} n_s \delta^{2^{s-3} d^2} \le n \sum_{s=1}^{2 + \log d} n^{O(s)} d^{O(d)} \delta^{2^{s-3} d^2} = o(1), \]

where the last estimate uses $d = c\sqrt{\log n}$ and $c$ is a sufficiently large constant depending on $\delta$ to ensure
that each of the terms in the sum is $o(n^{-2})$.

It therefore follows that $\mathbb{P}[E]$ tends to zero as $n$ tends to infinity, completing the proof. □

By considering a random $r$-coloring of $[n]$ and carefully checking the dependence of the constant factor
on $\delta$ in the above proof, we have the following corollary.






Corollary 7.1 There exists a constant $c$ such that

\[ h(d,r) \ge r^{cd^2}. \]

We will not be able to improve this further unless we can improve the bound on the van der Waerden
number $W(k,r)$, the smallest number which guarantees a $k$-term AP in an $r$-coloring of $[W(k,r)]$.
Indeed, the lower bound on $W(k,2)$ is exponential in $k$ and $W(d^2,r) \ge h(d,r)$ as a $d^2$-term AP
contains a $d$-cube. It seems plausible that $W(d^2,2)$ and hence $h(d,2)$ grow as exponential in $d^2$.
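One way to see that a $d^2$-term AP contains a $d$-cube is to take the generators to be the first $d$ multiples of the common difference. The sketch below checks this particular choice, which is an illustration rather than a construction given in the text; it works for $d \ge 2$, since the cube then needs $1 + d(d+1)/2 \le d^2$ terms. The function name is hypothetical.

```python
def cube_in_ap(a, b, d):
    """Check that H(a, b, 2b, ..., db) lies in the d^2-term AP a, a+b, ..., a+(d^2-1)b."""
    cube = {a}
    for i in range(1, d + 1):
        cube |= {h + i * b for h in cube}      # add the generator i*b
    ap = {a + j * b for j in range(d * d)}
    return cube <= ap

print(all(cube_in_ap(3, 5, d) for d in range(2, 9)))
```

The generators $b, 2b, \ldots, db$ are distinct positive integers, so this is a Hilbert cube of dimension $d$ in the sense used here.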